class: center, middle, inverse, title-slide

.title[
# STA 235H - Final Trivia
]
.subtitle[
## Fall 2023
]
.author[
### McCombs School of Business, UT Austin
]

---

<!-- <script type="text/javascript"> -->
<!-- MathJax.Hub.Config({ -->
<!--   "HTML-CSS": { -->
<!--     preferredFont: null, -->
<!--     webFont: "Neo-Euler" -->
<!--   } -->
<!-- }); -->
<!-- </script> -->

<style type="text/css">
.small .remark-code { /*Change made here*/
  font-size: 80% !important;
}

.tiny .remark-code { /*Change made here*/
  font-size: 80% !important;
}
</style>

# Rules of Final Trivia

1) **.darkorange[Form groups]**: 2 or 3 students (no more, no less).

--

2) **.darkorange[Choose a name for your group]**: You can be funny or classic.

--

3) **.darkorange[You need to complete all the questions]**: It doesn't matter if you don't know the answer! Make your best guess.

--

4) **.darkorange[Ask questions]**: I will give your team time to complete each question; after the time is up, you will submit your answers and we will check them.

  - If something isn't clear, **now is the time to ask**.

--

5) **.darkorange[There are prizes]**: At the end of the session, we will crown the teams that perform the best. If there is a tie in scores, the team that submits their answers the fastest moves up.

--

.center[**.darkorange[Note:]** All slides and answers will be posted on Wednesday at 4pm. Make sure to take notes!]

---
background-position: 50% 50%
class: left, bottom, inverse

.big[
Regressions
]

---

# Amazon prices

In this section, we are looking at luggage prices on Amazon US. We have data scraped from the website:

```r
library(tidyverse)  # provides %>% and select()

amz = read.csv("https://raw.githubusercontent.com/maibennett/website_github/master/exampleSite/content/files/data/amz_luggage.csv")

amz %>% select(-title) %>% head(.)
```

```
##      uid       asin stars reviews boughtInLastMonth isBestSeller  price
## 1 297768 B004DPRTSE   3.7     138                 0            0  67.90
## 2 297745 B0BSBX2XPR   3.6     379               100            0 109.11
## 3 295361 B0BG96C62P   4.7    2835                 0            0 100.42
## 4 295069 B00NF9HISK   3.3     825                 0            0  57.56
## 5 297025 B09VD9FGDH   2.5     221                 0            0  70.11
## 6 296801 B0B3LZ9KNH   2.5    3368               200            0 102.42
```

We want to find the association between covariates and the outcome (`price`).

---

# Question 1

.small[
```r
lm1 = lm(price ~ isBestSeller*stars, data = amz)

summary(lm1)
```

```
##
## Call:
## lm(formula = price ~ isBestSeller * stars, data = amz)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -115.054  -26.893   -1.386   25.022  137.700
##
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)
## (Intercept)         99.3724     1.8165  54.707   <2e-16 ***
## isBestSeller        29.4236    17.1668   1.714   0.0866 .
## stars                4.9546     0.5299   9.350   <2e-16 ***
## isBestSeller:stars  11.5092     4.9535   2.323   0.0202 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 38.4 on 4037 degrees of freedom
## Multiple R-squared:  0.0586, Adjusted R-squared:  0.0579
## F-statistic: 83.77 on 3 and 4037 DF,  p-value: < 2.2e-16
```
]

---

# Question 2

.small[
```r
lm1 = lm(price ~ isBestSeller*stars, data = amz)

summary(lm1)
```

```
##
## Call:
## lm(formula = price ~ isBestSeller * stars, data = amz)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -115.054  -26.893   -1.386   25.022  137.700
##
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)
## (Intercept)         99.3724     1.8165  54.707   <2e-16 ***
## isBestSeller        29.4236    17.1668   1.714   0.0866 .
## stars                4.9546     0.5299   9.350   <2e-16 ***
## isBestSeller:stars  11.5092     4.9535   2.323   0.0202 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 38.4 on 4037 degrees of freedom
## Multiple R-squared:  0.0586, Adjusted R-squared:  0.0579
## F-statistic: 83.77 on 3 and 4037 DF,  p-value: < 2.2e-16
```
]

---

# Question 3

.small[
```r
lm1 = lm(price ~ isBestSeller*stars, data = amz)

summary(lm1)
```

```
##
## Call:
## lm(formula = price ~ isBestSeller * stars, data = amz)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -115.054  -26.893   -1.386   25.022  137.700
##
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)
## (Intercept)         99.3724     1.8165  54.707   <2e-16 ***
## isBestSeller        29.4236    17.1668   1.714   0.0866 .
## stars                4.9546     0.5299   9.350   <2e-16 ***
## isBestSeller:stars  11.5092     4.9535   2.323   0.0202 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 38.4 on 4037 degrees of freedom
## Multiple R-squared:  0.0586, Adjusted R-squared:  0.0579
## F-statistic: 83.77 on 3 and 4037 DF,  p-value: < 2.2e-16
```
]

---
background-position: 50% 50%
class: left, bottom, inverse

.big[
Causal Inference
]

---

# Does Academic Probation Work?

Academic probation is a tool used by most universities to make sure students maintain minimum academic standards. In this section, we will analyze data from a large Canadian university regarding the effects of academic probation, originally used in Lindo, Sanders, and Oreopoulos’ (2010) paper, “Ability, Gender, and Performance Standards: Evidence from Academic Probation”.

.tiny[
```r
probation = read.csv("https://raw.githubusercontent.com/maibennett/website_github/master/exampleSite/content/files/data/probation.csv")
```
]

.pull-left[
.small[
- `creditsY`: Credits attempted in year Y = 1,2.
- `credits_earnedY`: Credits earned in year Y = 1,2.
- `GPA_yearY`: GPA at the end of year Y = 1,2.
- `CGPA_yearY`: Cumulative GPA at the end of year Y = 1,2.
- `sex`: Gender of the student (M: Male, F: Female).
- `age_at_entry`: Age of the student when they first enrolled.]]

.pull-right[
.small[
- `gradinY`: Student graduated in Y years, Y = 4, 5, or 6.
- `left_school`: Whether the student left school or not after the first assessment.
- `hsgrade_pct`: Percentile rank of the student's high school grades.
- `probation_year1`: Whether the student was on academic probation by the end of year 1.
- `suspended_year1`: Whether the student was suspended by the end of year 1.]]

---

# Question 4

.small[
```r
library(estimatr)  # provides lm_robust()

summary(lm_robust(left_school ~ probation_year1, data = probation))
```

```
##
## Call:
## lm_robust(formula = left_school ~ probation_year1, data = probation)
##
## Standard error type:  HC2
##
## Coefficients:
##                 Estimate Std. Error t value   Pr(>|t|) CI Lower CI Upper    DF
## (Intercept)      0.03755  0.0009849   38.13 5.691e-313  0.03562  0.03948 44360
## probation_year1  0.07165  0.0038290   18.71  7.761e-78  0.06415  0.07916 44360
##
## Multiple R-squared:  0.01481 , Adjusted R-squared:  0.01479
## F-statistic: 350.2 on 1 and 44360 DF,  p-value: < 2.2e-16
```
]

---

# Question 5

.tiny[
```r
probation = probation %>% filter(left_school==0)

summary(lm(GPA_year2 ~ probation_year1 + credits1 + credits_earned1 + GPA_year1 + factor(sex) + age_at_entry + hsgrade_pct, data = probation))
```

```
##
## Call:
## lm(formula = GPA_year2 ~ probation_year1 + credits1 + credits_earned1 +
##     GPA_year1 + factor(sex) + age_at_entry + hsgrade_pct, data = probation)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.3545 -0.3239  0.0646  0.3708  2.5300
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)      1.0557184  0.0803853  13.133  < 2e-16 ***
## probation_year1  0.2827426  0.0132546  21.332  < 2e-16 ***
## credits1        -0.0069394  0.0120652  -0.575   0.5652
## credits_earned1  0.0245169  0.0116370   2.107   0.0351 *
## GPA_year1        0.6971113  0.0059157 117.842  < 2e-16 ***
## factor(sex)M    -0.0957468  0.0061847 -15.481  < 2e-16 ***
## age_at_entry    -0.0248064  0.0041120  -6.033 1.63e-09 ***
## hsgrade_pct      0.0032055  0.0001307  24.529  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5817 on 38322 degrees of freedom
##   (3857 observations deleted due to missingness)
## Multiple R-squared:  0.5139, Adjusted R-squared:  0.5139
## F-statistic:  5789 on 7 and 38322 DF,  p-value: < 2.2e-16
```
]

---

# Question 6

.tiny[
```r
probation = probation %>% filter(left_school==0)

summary(lm(GPA_year2 ~ probation_year1 + credits1 + credits_earned1 + GPA_year1 + factor(sex) + age_at_entry + hsgrade_pct, data = probation))
```

```
##
## Call:
## lm(formula = GPA_year2 ~ probation_year1 + credits1 + credits_earned1 +
##     GPA_year1 + factor(sex) + age_at_entry + hsgrade_pct, data = probation)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.3545 -0.3239  0.0646  0.3708  2.5300
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)      1.0557184  0.0803853  13.133  < 2e-16 ***
## probation_year1  0.2827426  0.0132546  21.332  < 2e-16 ***
## credits1        -0.0069394  0.0120652  -0.575   0.5652
## credits_earned1  0.0245169  0.0116370   2.107   0.0351 *
## GPA_year1        0.6971113  0.0059157 117.842  < 2e-16 ***
## factor(sex)M    -0.0957468  0.0061847 -15.481  < 2e-16 ***
## age_at_entry    -0.0248064  0.0041120  -6.033 1.63e-09 ***
## hsgrade_pct      0.0032055  0.0001307  24.529  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5817 on 38322 degrees of freedom
##   (3857 observations deleted due to missingness)
## Multiple R-squared:  0.5139, Adjusted R-squared:  0.5139
## F-statistic:  5789 on 7 and 38322 DF,  p-value: < 2.2e-16
```
]

---
background-position: 50% 50%
class: left, bottom, inverse

.big[
Prediction
]

---

# Candy, candy, candy

In this section, we will be predicting win percentage for candy bars! We have the following dataset for this:

.tiny[
```r
candy = read.csv("https://raw.githubusercontent.com/maibennett/website_github/master/exampleSite/content/files/data/candy_r.csv")
```
]

.pull-left[
.small[
- `competitorname`: Name of the candy.
- `chocolate`: Is it chocolate?
- `fruity`: Is it fruit flavored?
- `caramel`: Is there caramel in the candy?
- `peanutalmondy`: Does it contain peanuts, peanut butter, or almonds?
- `nougat`: Does it contain nougat?
- `crispedricewafer`: Does it contain crisped rice, wafers, or a cookie component?
]]

.pull-right[
.small[
- `hard`: Is it a hard candy?
- `bar`: Is it a bar?
- `pluribus`: Is it one of many candies in a bag/box?
- `sugarpercent`: The percentile of sugar it falls under within the data set.
- `pricepercent`: The unit price percentile compared to the rest of the set.
- `popularity`: How popular the candy is (3 levels).
- `winpercent`: The overall win percentage according to 269,000 matchups.]]

---

# Question 7

<img src="f2023_sta235h_15_FinalTrivia_files/figure-html/rd_linear-1.svg" style="display: block; margin: auto;" />

---

# Question 8

Access the code <u>[here](https://www.magdalenabennett.com/files/data/Trivia/f2023_sta235h_15_FinalTrivia.R)</u>

---

# Question 9

Access the code <u>[here](https://www.magdalenabennett.com/files/data/Trivia/f2023_sta235h_15_FinalTrivia.R)</u>
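
---

# Sketch: comparing predictions on held-out data

For the prediction section, one common way to compare candidate models is to hold out part of the data and compute the root mean squared error (RMSE) on it. The slides do not state which metric or split is used, so the sketch below is only an assumption about how this could be done in base R. `holdout_rmse()` and its arguments are hypothetical names, not part of the course code linked above.

```r
# Hypothetical helper (not from the course code): hold-out RMSE for a linear model.
holdout_rmse <- function(formula, data, train_frac = 0.8, seed = 100) {
  set.seed(seed)                                   # make the random split reproducible
  n <- nrow(data)
  train_idx <- sample(n, size = floor(train_frac * n))
  train <- data[train_idx, , drop = FALSE]         # rows used to fit the model
  test  <- data[-train_idx, , drop = FALSE]        # rows held out for evaluation

  fit  <- lm(formula, data = train)                # fit on the training rows only
  pred <- predict(fit, newdata = test)             # predict the held-out rows

  outcome <- all.vars(formula)[1]                  # name of the left-hand-side variable
  # RMSE on the held-out rows; rows with missing predictions are dropped
  sqrt(mean((test[[outcome]] - pred)^2, na.rm = TRUE))
}
```

With the candy data loaded as `candy`, a call such as `holdout_rmse(winpercent ~ chocolate + sugarpercent + pricepercent, candy)` would return the held-out RMSE for that specification. The choice of predictors, the 80/20 split, and the seed are illustrative only.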